AITopics | dependency parser

A Fundamental Algorithm for Dependency Parsing (With Corrections)

Covington, Michael A.

arXiv.org Artificial Intelligence, Oct-24-2025

This paper develops, from first principles, several variations on a fundamental algorithm for parsing natural language sentences into dependency trees. It is an exposition of an algorithm that has been known, in some form, since the 1960s but is not presented systematically in the extant literature. Unlike phrase-structure (constituency) parsers, this algorithm operates one word at a time, attaching each word as soon as it can be attached. There is good evidence that the parsing process used by the human mind has these properties [1].
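
The word-at-a-time strategy described in this abstract can be illustrated with a minimal sketch. It is not the paper's exact formulation: the names `incremental_parse` and `can_depend`, and the toy permission table `ALLOWED`, are assumptions made for this example. Each new word first collects any earlier headless words that may depend on it, then looks for its own head among the earlier words, nearest first.

```python
from typing import Callable, Dict, List, Optional


def _is_ancestor(head: Dict[int, Optional[int]], ancestor: int, node: int) -> bool:
    """True if `ancestor` is `node` itself or lies on its chain of heads."""
    current: Optional[int] = node
    while current is not None:
        if current == ancestor:
            return True
        current = head[current]
    return False


def incremental_parse(
    words: List[str],
    can_depend: Callable[[str, str], bool],  # hypothetical oracle: (head, dependent)
) -> Dict[int, Optional[int]]:
    """Map each word index to its head index (None if no head was found)."""
    head: Dict[int, Optional[int]] = {}
    for i, word in enumerate(words):
        head[i] = None
        # Earlier words that still lack a head may become dependents of the new word.
        for j in range(i - 1, -1, -1):
            if head[j] is None and can_depend(word, words[j]):
                head[j] = i
        # The new word takes the nearest permitted earlier word as its head,
        # skipping any choice that would close a cycle.
        for j in range(i - 1, -1, -1):
            if can_depend(words[j], word) and not _is_ancestor(head, i, j):
                head[i] = j
                break
    return head


# Toy grammar (an assumption): pairs of (head word, dependent word).
ALLOWED = {("dog", "the"), ("barks", "dog")}
heads = incremental_parse(["the", "dog", "barks"], lambda h, d: (h, d) in ALLOWED)
print(heads)  # {0: 1, 1: 2, 2: None}
```

This sketch enforces a single head per word and acyclicity only; it does not enforce projectivity, and a realistic grammar would decide `can_depend` from categories and features rather than word pairs.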

artificial intelligence, natural language, parser, (16 more...)

2510.19996

Country: North America > United States > California (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

The UD-NewsCrawl Treebank: Reflections and Challenges from a Large-scale Tagalog Syntactic Annotation Project

Aquino, Angelina A., Miranda, Lester James V., Or, Elsie Marie T.

arXiv.org Artificial Intelligence, May-28-2025

This paper presents UD-NewsCrawl, the largest Tagalog treebank to date, containing 15.6k trees manually annotated according to the Universal Dependencies framework. We detail our treebank development process, including data collection, pre-processing, manual annotation, and quality assurance procedures. We provide baseline evaluations using multiple transformer-based models to assess the performance of state-of-the-art dependency parsers on Tagalog. We also highlight challenges in the syntactic analysis of Tagalog given its distinctive grammatical properties, and discuss its implications for the annotation of this treebank. We anticipate that UD-NewsCrawl and our baseline model implementations will serve as valuable resources for advancing computational linguistics research in underrepresented languages like Tagalog.

computational linguistic, large language model, machine learning, (19 more...)

2505.20428

Country:

Europe (1.00)
Asia > Middle East (0.67)
North America > United States > Minnesota (0.28)
Asia > Philippines > Luzon > National Capital Region > City of Manila (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (0.93)
Leisure & Entertainment > Sports > Basketball (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

A Systematic Comparison of Syntactic Representations of Dependency Parsing

Wisniewski, Guillaume, Lacroix, Ophélie

arXiv.org Artificial Intelligence, Mar-10-2025

We compare the performance of a transition-based parser across different annotation schemes. We propose to convert some specific syntactic constructions observed in the Universal Dependencies treebanks into a so-called more standard representation and to evaluate parsing performance over all the languages of the project. We show that the "standard" constructions do not systematically lead to better parsing performance and that the scores vary considerably across languages.

dependency, representation, transformation, (14 more...)

2503.07142

Country:

Europe > Czechia > Prague (0.05)
Europe > Sweden > Uppsala County > Uppsala (0.05)
Europe > Denmark > Capital Region > Copenhagen (0.05)
(9 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Better Benchmarking LLMs for Zero-Shot Dependency Parsing

Ezquerro, Ana, Gómez-Rodríguez, Carlos, Vilares, David

arXiv.org Artificial Intelligence, Feb-28-2025

While LLMs excel in zero-shot tasks, their performance in linguistic challenges like syntactic parsing has been less scrutinized. This paper studies state-of-the-art open-weight LLMs on the task by comparing them to baselines that do not have access to the input sentence, including baselines that have not been used in this context such as random projective trees or optimal linear arrangements. The results show that most of the tested LLMs cannot outperform the best uninformed baselines, with only the newest and largest versions of LLaMA doing so for most languages, and still achieving rather low performance. Thus, accurate zero-shot syntactic parsing is not forthcoming with open LLMs.
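
A baseline that never sees the input sentence needs only the sentence length. The sketch below is one simple way to draw a random projective tree, offered as an illustration rather than as the paper's procedure: the function names are assumptions, and the sampling is not uniform over all projective trees. Heads are 1-based token positions, with 0 standing for an artificial root.

```python
import random
from typing import List, Optional


def random_projective_heads(n: int, rng: Optional[random.Random] = None) -> List[int]:
    """Return heads for tokens 1..n (0 = artificial root) forming a projective tree."""
    rng = rng or random.Random()
    heads = [0] * (n + 1)  # index 0 is unused padding

    def build(lo: int, hi: int, parent: int) -> None:
        # Pick a root for the contiguous span [lo, hi], then recurse on both sides.
        if lo > hi:
            return
        root = rng.randint(lo, hi)
        heads[root] = parent
        build(lo, root - 1, root)
        build(root + 1, hi, root)

    build(1, n, 0)
    return heads[1:]


def is_projective(heads: List[int]) -> bool:
    """True if no two arcs cross when drawn above the sentence (root at position 0)."""
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    return not any(a < c < b < d for a, b in arcs for c, d in arcs)


sample = random_projective_heads(8, random.Random(0))
print(sample, is_projective(sample))
```

Because every subtree covers a contiguous span, the result is projective by construction; scoring such trees against gold annotations gives a floor that an informed parser should clear.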

baseline, computational linguistic, linguistic, (14 more...)

2502.20866

Country:

Asia > Singapore (0.04)
North America > Mexico > Mexico City > Mexico City (0.04)
North America > Canada > Ontario > Toronto (0.04)
(11 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Causal Graphical Models for Vision-Language Compositional Understanding

Parascandolo, Fiorenzo, Moratelli, Nicholas, Sangineto, Enver, Baraldi, Lorenzo, Cucchiara, Rita

arXiv.org Artificial Intelligence, Dec-12-2024

Recent work has empirically shown that Vision-Language Models (VLMs) struggle to fully understand the compositional properties of human language, usually modeling an image caption as a "bag of words". As a result, they perform poorly on compositional tasks, which require a deeper understanding of the different entities of a sentence (subject, verb, etc.) jointly with their mutual relationships in order to be solved. In this paper, we model the dependency relations among textual and visual tokens using a Causal Graphical Model (CGM), built using a dependency parser, and we train a decoder conditioned by the VLM visual encoder. Unlike standard autoregressive or parallel predictions, our decoder's generative process is partially ordered, following the CGM structure. This structure encourages the decoder to learn only the main causal dependencies in a sentence, discarding spurious correlations. Using extensive experiments on five compositional benchmarks, we show that our method outperforms all the state-of-the-art compositional approaches by a large margin, and that it also improves over methods trained using much larger datasets.

caption, large language model, machine learning, (19 more...)

2412.09353

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(2 more...)

Thai Universal Dependency Treebank

Sriwirote, Panyut, Leong, Wei Qi, Polpanumas, Charin, Thanyawong, Santhawat, Tjhi, William Chandra, Aroonmanakun, Wirote, Rutherford, Attapol T.

arXiv.org Artificial Intelligence, May-13-2024

Automatic dependency parsing of Thai sentences has been underexplored, as evidenced by the lack of large Thai dependency treebanks with complete dependency structures and the lack of a published systematic evaluation of state-of-the-art models, especially transformer-based parsers. In this work, we address these problems by introducing the Thai Universal Dependency Treebank (TUD), the new largest Thai treebank, consisting of 3,627 trees annotated in accordance with the Universal Dependencies (UD) framework. We then benchmark dependency parsing models that incorporate pretrained transformers as encoders and train them on Thai-PUD and our TUD. The evaluation results show that most of our models can outperform other models reported in previous papers and provide insight into the optimal choices of components to include in Thai dependency parsers. The new treebank and every model's full predictions generated in our experiments are made available in a GitHub repository for further study.

dependency, parser, treebank, (16 more...)

2405.07586

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Slovenia (0.04)
Europe > Italy (0.04)
(6 more...)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Empirical Analysis for Unsupervised Universal Dependency Parse Tree Aggregation

Kulkarni, Adithya, Eulenstein, Oliver, Li, Qi

arXiv.org Artificial Intelligence, Apr-3-2024

Dependency parsing is an essential task in NLP, and the quality of dependency parsers is crucial for many downstream tasks. Parsers' quality often varies depending on the domain and the language involved. Therefore, it is essential to combat the issue of varying quality to achieve stable performance. In various NLP tasks, aggregation methods are used for post-processing and have been shown to combat the issue of varying quality. However, such post-processing aggregation methods have not been sufficiently studied in dependency parsing tasks. In an extensive empirical study, we compare different unsupervised post-processing aggregation methods to identify the most suitable dependency tree structure aggregation method.

aggregation, parser, treebank, (16 more...)

2403.19183

Country:

North America > United States > Iowa (0.04)
Europe > Hungary > Csongrád-Csanád County > Szeged (0.04)
Europe > Finland > Southwest Finland > Turku (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

BibRank: Automatic Keyphrase Extraction Platform Using Metadata

Eldallal, Abdelrhman, Barbu, Eduard

arXiv.org Artificial Intelligence, Oct-13-2023

Automatic Keyphrase Extraction involves identifying essential phrases in a document. These keyphrases are crucial in various tasks such as document classification, clustering, recommendation, indexing, searching, summarization, and text simplification. This paper introduces a platform that integrates keyphrase datasets and facilitates the evaluation of keyphrase extraction algorithms. The platform includes BibRank, an automatic keyphrase extraction algorithm that leverages a rich dataset obtained by parsing bibliographic data in BibTeX format. BibRank combines innovative weighting techniques with positional, statistical, and word co-occurrence information to extract keyphrases from documents. The platform proves valuable for researchers and developers seeking to enhance their keyphrase extraction algorithms and advance the field of natural language processing.

algorithm, dataset, keyphrase extraction algorithm, (12 more...)

doi: 10.3390/info14100549

2310.09151

Country:

Europe > Estonia > Tartu County > Tartu (0.05)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)

Genre:

Research Report (1.00)
Overview (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)

Hexatagging: Projective Dependency Parsing as Tagging

Amini, Afra, Liu, Tianyu, Cotterell, Ryan

arXiv.org Artificial Intelligence, Jun-8-2023

We introduce a novel dependency parser, the hexatagger, that constructs dependency trees by tagging the words in a sentence with elements from a finite set of possible tags. In contrast to many approaches to dependency parsing, our approach is fully parallelizable at training time, i.e., the structure-building actions needed to build a dependency parse can be predicted in parallel to each other. Additionally, exact decoding is linear in time and space complexity. Furthermore, we derive a probabilistic dependency parser that predicts hexatags using no more than a linear model with features from a pretrained language model, i.e., we forsake a bespoke architecture explicitly designed for the task. Despite the generality and simplicity of our approach, we achieve state-of-the-art performance of 96.4 LAS and 97.4 UAS on the Penn Treebank test set. Additionally, our parser's linear time complexity and parallelism significantly improve computational efficiency, with a roughly 10-times speed-up over previous state-of-the-art models during decoding.
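
The UAS and LAS figures quoted in this abstract are attachment scores: UAS is the fraction of tokens assigned the correct head, and LAS additionally requires the correct dependency label. A minimal sketch of the computation follows; the function name and the toy annotations are assumptions, and evaluation conventions such as excluding punctuation are not handled here.

```python
from typing import List, Tuple

Arc = Tuple[int, str]  # (head position, dependency label); head 0 marks the root


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) as fractions in [0, 1] for one or more aligned sentences."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")
    n = len(gold)
    # UAS counts matching heads; LAS counts matching (head, label) pairs.
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / n
    las = sum(g == p for g, p in zip(gold, pred)) / n
    return uas, las


# Toy example: every head is right, but one label is wrong.
gold = [(2, "det"), (3, "nsubj"), (0, "root")]
pred = [(2, "det"), (3, "obj"), (0, "root")]
uas, las = attachment_scores(gold, pred)
print(f"UAS={uas:.3f} LAS={las:.3f}")  # UAS=1.000 LAS=0.667
```

Published scores are usually reported as percentages, so the fractions above would be multiplied by 100.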

computational linguistic, machine learning, natural language, (16 more...)

2306.05477

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.04)
Asia > India > Karnataka > Bengaluru (0.04)
(17 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Quinductor: a multilingual data-driven method for generating reading-comprehension questions using Universal Dependencies

Kalpakchi, Dmytro, Boye, Johan

arXiv.org Artificial Intelligence, May-12-2023

We propose a multilingual data-driven method for generating reading comprehension questions using dependency trees. Our method provides a strong, mostly deterministic, and inexpensive-to-train baseline for less-resourced languages. While a language-specific corpus is still required, its size is nowhere near those required by modern neural question generation (QG) architectures. Our method surpasses QG baselines previously reported in the literature and shows a good performance in terms of human evaluation.

1 Introduction

We are interested in question generation (QG): the task of automatically generating reading comprehension questions and their correct answers from given declarative sentences. Numerous methods have been proposed for solving this task, most of which have been aimed at the English language. Recent methods are based on neural networks and rely on the availability of large-scale datasets, such as SQuAD (Rajpurkar et al. 2016), a question-answering dataset repurposed for QG, or large-scale pretrained models, such as GPT-3 (Brown et al. 2020). Early methods, mostly based on context-free grammars, relied on the strict word order and the limited inflectional morphology of English. These traits made it relatively straightforward to craft handwritten templates based on these grammars. The above-mentioned idiosyncrasies and the unique availability of large-scale resources for English leave a number of open challenges for developing QG methods applicable to languages other than English.

The first challenge is the lack of large-scale training datasets and the prohibitively high cost of obtaining such resources. State-of-the-art QG methods for English train their models on the previously mentioned SQuAD dataset, which contains more than 100,000 questions. Obtaining a good-quality dataset of a similar size is very expensive, especially for languages with fewer native speakers around the world.

The second challenge is knowing how well available methods developed for English would generalize to other languages, especially synthetic ones with richer inflectional morphology and less strict word order (e.g., Finnish, Turkish or Russian). To the best of our knowledge, not much research has been done on QG for these kinds of languages. The third challenge is assessing the obtained performance results.

machine learning, natural language, question answering, (23 more...)

2103.10121

Country:

South America > Brazil (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(2 more...)

Genre:

Workflow (0.93)
Research Report (0.63)

Industry: Education > Assessment & Standards > Student Performance (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.92)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.91)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)
